-
Notifications
You must be signed in to change notification settings - Fork 667
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
make scipy and matplotlib required dependencies (#1159) #1404
Conversation
7552db8
to
cf0e31a
Compare
.travis.yml
Outdated
- PIP_DEPENDENCIES='griddataformats' | ||
- CONDA_DEPENDENCIES="mmtf-python nose=1.3.7 mock six biopython networkx cython joblib nose-timer matplotlib scipy griddataformats" | ||
- CONDA_ALL_DEPENDENCIES="mmtf-python nose=1.3.7 mock six biopython networkx cython joblib nose-timer matplotlib netcdf4 scikit-learn scipy griddataformats seaborn coveralls clustalw=2.1" | ||
- PIP_DEPENDENCIES="" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can remove this line completely.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good besides some minor comments
@@ -42,6 +42,7 @@ | |||
'contact_matrix', 'dist', 'between'] | |||
|
|||
import numpy as np | |||
import scipy.sparse |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not from scipy import sparse?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the full imports much better if we only use a few things. Much easier to grep. And really not much more to write. Why not be explicit with where you get functions from? And it keeps the name space tidier.
(Sidenote: I particular dislike function imports – I think they are really bad because you have no idea where they come from when reading code. )
del msg | ||
|
||
import numpy as np | ||
import scipy.stats |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not from scipy.stats import gaussian_kde?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because that is evil.
Honestly, function imports at the module level are bad. See comments above. See PEP something or other IIRC...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PS: Please don't.
@@ -155,14 +155,16 @@ | |||
from __future__ import division, absolute_import | |||
from six.moves import zip | |||
import numpy as np | |||
import scipy.optimize |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
again why not import the one function we need?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See above. Don't.
@@ -258,6 +257,10 @@ | |||
import logging | |||
from itertools import cycle | |||
|
|||
import numpy as np | |||
import matplotlib |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a common idom is import matplotlib as mpl
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, but we don't use it a lot and it's clearer to read and to grep for: if I am looking for matplotlib I use git grep matplotlib
.
:func:`scipy.spatial.distance.directed_hausdorff` is an optimized | ||
implementation of the early break algorithm of [Taha2015]_; note that one | ||
still has to calculate the *symmetric* Hausdorff distance as | ||
`max(directed_hausdorff(P, Q)[0], directed_hausdorff(Q, P)[0])`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are both still rendered under notes in the sphinx output?
# all standard requirements are available through PyPi and | ||
# typically can be installed without difficulties through setuptools | ||
setup_requires=[ | ||
'numpy>=1.9.3', | ||
'numpy>=1.10.4', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
did you add this bump to the CHANGELOG? I know we did it a while again and forgot to change setup.py
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes we did add it to the CHANGELOG alread.
@@ -128,7 +128,7 @@ def test_triangular_matrix(): | |||
|
|||
multiplied_triangular_matrix_2 = triangular_matrix_2 * scalar | |||
assert_equal(multiplied_triangular_matrix_2[0,1], expected_value * scalar, | |||
err_msg="Error in TriangularMatrix: multiplication by scalar gave\ | |||
err_msg="Error in TriangularMatrix: multiplication by scalar gave\ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please remove the trailing \
and align with the opening brackets in the previous line
I pushed more commits to the branch. I don't know why they dont show up for the PR |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really dislike importing functions from external modules because it makes it hard to understand where they come from (and they pollute the name space, but that's minor... until you define similar sounding things yourself).
@@ -42,7 +42,7 @@ | |||
'contact_matrix', 'dist', 'between'] | |||
|
|||
import numpy as np | |||
import scipy.sparse | |||
from scipy import sparse |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's ok, I can live with it.
If I take only few things I like fully qualified imports but this one is fine if you like it better.
@@ -176,7 +176,7 @@ | |||
import logging | |||
|
|||
import numpy as np | |||
import scipy.stats | |||
from scipy.stats import gaussian_kde |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please don't. This looses all the context and when browsing code it is not all clear where this function comes from. Is it defined locally? Is it imported from elsewhere?
I really hate it when used for things not in the standard library.
Please revert.
@@ -155,7 +155,7 @@ | |||
from __future__ import division, absolute_import | |||
from six.moves import zip | |||
import numpy as np | |||
import scipy.optimize | |||
from scipy.optimize leastsq |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please don't do this. See comment above.
package/MDAnalysis/analysis/pca.py
Outdated
@@ -106,7 +106,7 @@ | |||
import warnings | |||
|
|||
import numpy as np | |||
import scipy.integrate | |||
from scipy.integrate import simps |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please don't. See above.
@@ -36,7 +36,7 @@ | |||
from six.moves import range | |||
|
|||
import numpy as np | |||
import scipy.optimize | |||
from scipy.optimize import curve_fit |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please don't. See above.
a40dbf8
to
259cdbd
Compare
@kain88-de I do appreciate it that you humoured my personal pet-peeve with the imports. Apologies if my comments were not quite as civil as they could have been. Thanks for working with me on these boring last minute PRs! |
The travis failure is in full because it exceeded 50 mins ( #1394 ...). |
I will clean up the history. |
259cdbd
to
f8d5c5e
Compare
Compacted history and force-pushed. Should be ready for review and merge. |
Can we retarget this to 0.17.0 ? This takes consistently longer then 50 minutes to run our full test suite. I don't understand how this change suddenly added more then 10 minutes to our overall runtime. |
It looks to me that the overall +10 minutes comes from the fact that "minimal" is not so minimal anymore – having scipy enables many more tests. Minimal ran for 41 min and full timed out at 50 mins but in other cases full ran for 48:44 min. In the other case, minimal took 25:33 min. Running the setup (astropy helpers) took 441.98s for the time-out full, 328.76s for the minimal with scipy and 294.30s for the other full that passed but only 137.51s for the other minimal without scipy. I think we still have a general problem with the run time of full. I didn't quite understand the data in #1409 but if different installation of netcdf4 or travis caching #1405 could help then we should investigate further. I am not convinced that this PR in particular extends the installation time of full because these installation times have been fairly random as of late; the numbers that I quote above differ by almost 150s for full without any changes in the travis config that would affect the full build. I would advocate for adding it to 0.16.2 because it is ready and it will make things a bit easier for downstream packaging #1361. |
The runtime of full is too long even with the change in installation times. We used to be at ~30-35 minutes for a full test run. Our installation doesn't suddenly take more then 10 minutes. Also if full regularly runs into the time limit we won't know about our code coverage. |
Yes, true. But if understand you correctly then you're hypothesizing that it is due to this PR. The numbers above show that it is not due to this PR: the full one https://travis-ci.org/MDAnalysis/mdanalysis/builds/244102132 for PR #1403 also takes almost 50 minutes.
Fair enough, ~150 s is only 2:30 min. Did you look at what takes more time in the tests? From the 48 min full:
If you compare this to the minimal with scipy
There is a lot of variation: for instance
Yes, and that is definitely bad. Perhaps we need to split the build matrix into core and analysis and report coverage separately (or find a way to merge across builds). |
@MDAnalysis/coredevs this needs a review, thanks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Except for the scipy version, everything looks OK.
@@ -494,11 +494,12 @@ def dynamic_author_list(): | |||
classifiers=CLASSIFIERS, | |||
cmdclass=cmdclass, | |||
requires=['numpy (>=1.10.4)', 'biopython', 'mmtf (>=1.0.0)', | |||
'networkx (>=1.0)', 'GridDataFormats (>=0.3.2)', 'joblib'], | |||
'networkx (>=1.0)', 'GridDataFormats (>=0.3.2)', 'joblib', | |||
'scipy', 'matplotlib (>=1.5.1)'], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hausdorff distance was introduced in scipy 0.19.0, maybe we should specify the minimum version.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We still have our own hausdorff distance I think. You can open an issue for switching to the scipy implementation with a fallback to our own.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I read a comment to fast; I thought we switched to the scipy version. My bad.
f8d5c5e
to
a07d665
Compare
@orbeckst, you didn't add yourself to the version's author list in the CHANGELOG. |
I added Oliver |
d378937
to
85c0c56
Compare
def setUp(self): | ||
sys.modules.pop('MDAnalysis.analysis.distances', None) | ||
|
||
@block_import('scipy') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was so proud of this decorator, and now we hardly need it 😆
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, worry not, I'm opening a PR in 5 minutes where it is very useful!
Is adding dependencies allowed in a 16.x release or does it have to be 0.17? Otherwise looks good |
Well with strict following of SemVer it is not. Even deprecating publib API would have to be in 0.17.0 and not 0.16.2. http://semver.org/ I personally think this is less of an issue then the deprecations. |
Tests are passing now. the full build just does a timeout like usual. |
The FAQ says that changes in dependencies can be a minor or a patch level release; the latter if it fixes a bug.
Good to know about the depredations supposed to be in a new minor. The most important thing is that there is another minor release between deprecation and removal. But we cannot afford that much time for Timeseries removal, the other deprecations are for 1.0 and users will have time to adjust.
Nevertheless, because we are still in 0.x we are, at least technically, allowed to do whatever we want within semver. However, we are trying to conform as much as possible/feasible to help users.
…--
Oliver Beckstein
email: [email protected]
Am Jun 21, 2017 um 02:59 schrieb Max Linke ***@***.***>:
Is adding dependencies allowed in a 16.x release or does it have to be 0.17? Otherwise looks good
Well with strict following of SemVer it is not. Even deprecating publib API would have to be in 0.17.0 and not 0.16.2. http://semver.org/
I personally think this is less of an issue then the deprecations.
|
- updated all modules - removed any code that guards against scipy or matplotlib import - conforms to style guide https://github.com/MDAnalysis/mdanalysis/wiki/Style-Guide#module-imports-in-mdanalysisanalysis - fixes #1159 - fixes #1361
85c0c56
to
f2f45f8
Compare
If there are now more issues with this PR I will merge it in the afternoon |
I note that my normal development workflow has been disrupted by alterations to the installation process, with development branch
The issue is resolved by doing this first: Perhaps another dev can try a fresh conda environment without matplotlib to see if they can reproduce this. Granted, it is on WSL, but that's basically just ubuntu. If you can reproduce this, maybe open an issue if it seems annoying enough, but I guess the conda install is a simple enough fix. |
When I Have you tried |
Fixes #1159 and #1361
Changes made in this Pull Request:
PR Checklist